The aggregation params of metrics can be defined #844 #845

thompson-tomo · 2025-07-13T04:39:05Z

Closes: #844

This is an attempt to help semconv move towards an automated metric registry by closing the gap between the currently documented information and what the models support.

This work was completed by following https://github.com/open-telemetry/weaver/blob/main/docs/developer-guide.md; if anything is missing, let's also add it to that document.

jerbly · 2025-07-13T12:08:24Z

Had a quick look. This appears to be a freeform YAML block. Could annotations be used instead? We recently backed out of a spec change for value_type in favour of annotations. This feels the same to me.

thompson-tomo · 2025-07-13T15:06:11Z

I did consider using annotations, but I would rather be more explicit in this case. I opted against a strongly typed option because custom aggregation types could be implemented.

We should discuss whether there is a valid use case for allowing semconv to specify a non-default aggregation type. If so, that would become a separate string property alongside the hashmap. At the same time, should the output schema have the default parameters populated if they are not provided in the input schema?

Also, aggregating when buckets are different sizes will impact your metrics, so this is more than just an annotation.

I kept this small to make it easier and quicker to review, and to enable more of semconv to be auto-generated.
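To make the question about populating defaults concrete, here is a minimal sketch using only the standard library. Everything in it is an assumption for illustration: `ParamValue` stands in for the YAML value type used in the PR, the `aggregation_type` field is the hypothetical separate string property, and the key names `max_size` and `max_scale` are illustrative rather than an agreed format.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a YAML value (the PR uses a YAML value type).
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum ParamValue {
    Int(i64),
    Float(f64),
    Text(String),
    List(Vec<ParamValue>),
}

/// Hypothetical shape: an optional aggregation type next to free-form parameters.
#[derive(Debug, Clone, PartialEq)]
struct AggregationSpec {
    aggregation_type: Option<String>,
    parameters: Option<BTreeMap<String, ParamValue>>,
}

/// Produce an "output" spec in which every default is present,
/// with explicitly provided parameters taking precedence.
fn resolve(input: &AggregationSpec, defaults: &BTreeMap<String, ParamValue>) -> AggregationSpec {
    let mut params = defaults.clone();
    if let Some(explicit) = &input.parameters {
        for (key, value) in explicit {
            params.insert(key.clone(), value.clone());
        }
    }
    AggregationSpec {
        aggregation_type: input.aggregation_type.clone(),
        parameters: Some(params),
    }
}

fn main() {
    // Illustrative defaults; the key names are assumptions.
    let mut defaults = BTreeMap::new();
    defaults.insert("max_size".to_string(), ParamValue::Int(160));
    defaults.insert("max_scale".to_string(), ParamValue::Int(20));

    // The input only overrides one parameter.
    let mut explicit = BTreeMap::new();
    explicit.insert("max_scale".to_string(), ParamValue::Int(10));
    let input = AggregationSpec {
        aggregation_type: Some("base2_exponential_bucket_histogram".to_string()),
        parameters: Some(explicit),
    };

    let resolved = resolve(&input, &defaults);
    println!("{:?}", resolved);
}
```

A free-form map keeps custom aggregation types possible, at the cost of leaving validation of individual keys to a schema or policy layer.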

jsuereth · 2025-07-14T22:32:45Z

I don't think we make any requirement on delta vs. cumulative, but we should have a way to specify histogram boundaries.

lmolkova · 2025-07-16T14:30:28Z

crates/weaver_semconv/src/aggregation.rs

+pub struct AggregationSpec {
+    /// The parameters used in the aggregation
+    #[serde(skip_serializing_if = "Option::is_none")]
+    pub parameters: Option<HashMap<String, YamlValue>>,    


Discussed in the tooling call:

Bucket boundaries are advisory params; changing them would not be breaking, so we should put them into annotations.

It'd still be useful to define the specific format, but not in the Rust code. We can define a specific format in the JSON schema, but we can also do it later and start by suggesting the format inside the semconv repo.

thompson-tomo · 2025-07-17

With it being advisory, won't you run into issues/inaccurate data if you have multiple instruments generating measurements with different boundaries but using the same metric? Hence it is hard to see it as advisory rather than a requirement. Also, what about the aggregation type? That is also a requirement.

jsuereth · 2025-07-22

"won't you run into issues/inaccurate data if you have multiple instruments generating measurements with different boundaries but using the same metric"

Not when following best practices for histograms/distributions. In fact, our exponential histograms are designed around bucket boundaries changing in the lifespan of the same time series.

The OpenTelemetry specification has explicitly allowed bucket boundaries to change during a time series' lifespan. So "default advice" would not have this restriction, and many metric systems will handle this effectively (even Prometheus, if using functions like histogram_quantile()).

So, by default, weaver will not enforce this. If someone wanted to enforce this, weaver would ALLOW that via annotations and custom rego policies.
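As a rough illustration of why differing boundaries need not break quantile queries, the sketch below records the same measurements into two explicit-bucket histograms with different boundaries and estimates the median by linear interpolation inside the bucket. This is a simplified stand-in in the spirit of histogram_quantile(), not Prometheus's actual implementation; the `Histogram` type and its methods are hypothetical, and the first bucket is assumed to start at 0.

```rust
/// Explicit-bucket histogram. `bounds` holds the inclusive upper bounds of the
/// finite buckets; `counts` has one extra slot for the overflow bucket.
struct Histogram {
    bounds: Vec<f64>,
    counts: Vec<u64>,
}

impl Histogram {
    fn new(bounds: Vec<f64>) -> Self {
        let slots = bounds.len() + 1;
        Histogram { bounds, counts: vec![0; slots] }
    }

    /// Count the value in the first bucket whose upper bound is >= value.
    fn record(&mut self, value: f64) {
        let idx = self
            .bounds
            .iter()
            .position(|b| value <= *b)
            .unwrap_or(self.bounds.len());
        self.counts[idx] += 1;
    }

    /// Estimate quantile `q` (0.0..=1.0) by interpolating within a bucket.
    fn quantile(&self, q: f64) -> Option<f64> {
        let total: u64 = self.counts.iter().sum();
        if total == 0 {
            return None;
        }
        let rank = q * total as f64;
        let mut cumulative = 0u64;
        for (i, &count) in self.counts.iter().enumerate() {
            let previous = cumulative;
            cumulative += count;
            if count > 0 && cumulative as f64 >= rank {
                if i == self.bounds.len() {
                    // Overflow bucket has no upper bound: report the last finite one.
                    return self.bounds.last().copied();
                }
                let lower = if i == 0 { 0.0 } else { self.bounds[i - 1] };
                let upper = self.bounds[i];
                let fraction = (rank - previous as f64) / count as f64;
                return Some(lower + (upper - lower) * fraction);
            }
        }
        None
    }
}

fn main() {
    let mut a = Histogram::new(vec![10.0, 25.0, 50.0, 75.0, 100.0]);
    let mut b = Histogram::new(vec![20.0, 40.0, 60.0, 80.0, 100.0]);
    for v in 1..=100 {
        a.record(v as f64);
        b.record(v as f64);
    }
    println!("median with boundaries A: {:?}", a.quantile(0.5));
    println!("median with boundaries B: {:?}", b.quantile(0.5));
}
```

The two estimates coincide here because the sample data is uniform. In general each estimate is an approximation whose error depends on how well the boundaries fit the distribution, which is the accuracy concern raised earlier in the thread.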

thompson-tomo · 2025-07-22

This aggregation object is for defining the properties based on the aggregation type being used. The supported settings are described in the spec at https://opentelemetry.io/docs/specs/otel/metrics/sdk/#aggregation.

That document gives the impression that if you define explicit boundaries the sdk is instructed to use them.

I will make it more explicit by adding the type in there; that way it is possible to fall back to the defaults. That also provides a way to indicate that it is exponential, etc.

Note annotations are not emitted by weaver for group members/signal in resolved form.

jsuereth · 2025-08-18

Apologies for the delay in responding here. A few important points:

"Note annotations are not emitted by weaver for group members/signal in resolved form."

This is a bug that should be fixed.

"That document gives the impression that if you define explicit boundaries the sdk is instructed to use them."

Yes, but there are two users of semconv here: the storage and visualization that use the histogram, and the instrumentation that produces it.

What we do NOT want is to pretend like there is a perfect set of boundaries in semconv for which all histograms should abide. We can provide a default to codegen. In reality, getting histogram boundaries right often needs specific knowledge of the system being developed. Poor boundaries can lead to inefficient and inaccurate histograms for your services. This is why we have an "advice" API and want histogram boundaries as a "hint" to codegen vs. a first class thing. Additionally, this is why we're moving to exponential histograms, where you can more easily say "here's the resolution I want" and the histogram expands boundaries to fit the appropriate distribution.

Today, we took the approach that code generation + "advice" in Metrics should be a hint in weaver.
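To illustrate the "here's the resolution I want" point, the sketch below derives bucket boundaries from a single scale value, following the base-2 exponential mapping as I understand it from the OpenTelemetry data model: base = 2^(2^-scale), with bucket `i` covering (base^i, base^(i+1)]. The function names are hypothetical, and the sketch ignores zero, negative and subnormal values as well as floating-point precision at exact bucket boundaries.

```rust
/// Growth factor between consecutive bucket boundaries for a given scale.
fn base(scale: i32) -> f64 {
    2f64.powf(2f64.powi(-scale))
}

/// Index of the bucket containing a positive `value`:
/// ceil(log2(value) * 2^scale) - 1.
fn bucket_index(value: f64, scale: i32) -> i32 {
    ((value.log2() * 2f64.powi(scale)).ceil() as i32) - 1
}

/// Lower (exclusive) and upper (inclusive) boundary of bucket `index`.
fn bucket_bounds(index: i32, scale: i32) -> (f64, f64) {
    let b = base(scale);
    (b.powi(index), b.powi(index + 1))
}

fn main() {
    let value = 3.0;
    // Higher scales give a smaller base and therefore finer buckets.
    for scale in [-1, 0, 1, 3] {
        let index = bucket_index(value, scale);
        let (lower, upper) = bucket_bounds(index, scale);
        println!(
            "scale {:>2}: base {:.4}, value {} falls in bucket {} = ({:.4}, {:.4}]",
            scale,
            base(scale),
            value,
            index,
            lower,
            upper
        );
    }
}
```

No boundary list appears anywhere in this configuration: the scale alone fixes the resolution, which is why an explicit list of boundaries is less central for this aggregation.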

thompson-tomo · 2025-08-20

"What we do NOT want is to pretend like there is a perfect set of boundaries in semconv for which all histograms should abide."

Agreed, it is bespoke to the metric; if it is not explicitly configured, the default is used, which could be the metric default or the global default.

"Additionally, this is why we're moving to exponential histograms, where you can more easily say 'here's the resolution I want' and the histogram expands boundaries to fit the appropriate distribution."

For the exponential histogram, do you see the scale and/or size properties as general advice or as requirements which should be followed?

Also, what about the aggregation method? Is this a requirement or advice?

The aggregation params of metrics can be defined open-telemetry#844

c31fdb4

thompson-tomo requested a review from a team as a code owner July 13, 2025 04:39

lmolkova reviewed Jul 16, 2025

thompson-tomo added 2 commits July 24, 2025 12:40

Add in aggregation method

71e1ca5

Merge branch 'main' into feature/#844_AddInAggregation

c74db42

lmolkova Jul 16, 2025

thompson-tomo Jul 17, 2025

jsuereth Jul 22, 2025

thompson-tomo Jul 22, 2025 (edited)

jsuereth Aug 18, 2025

thompson-tomo Aug 20, 2025
